WHAT IS CLAIMED IS:



1. A method of engineering one or more binding macromolecules to adequately bind to a
selected target macromolecule, wherein macromolecules comprise linear polymers of
monomers, said method comprising:

providing, as a first candidate binding macromolecule, a precursor macromolecule 
which binds to one or more terminal portions of an initial target macromolecule, 

deriving alternative candidate binding macromolecules by replacing one or more
monomers of a current candidate with new monomers, wherein the new monomers are
selected by rational engineering methods so that the alternative candidates are predicted to
bind with one or more terminal portions of the selected target macromolecule, and wherein
the rational engineering methods depend on

(i) concerning the selected target macromolecule, input data comprising
representations of one or more of its terminal monomer sequences, and

(ii) concerning the precursor macromolecule, input data comprising
representations of its monomer sequence and the monomer sequences of one or more of the
terminal portions of the initial target macromolecule bound by the precursor,

screening the alternative candidates for new candidates with improved estimated
binding to terminal portions of the selected target, wherein the binding is estimated by
rational methods in dependence on the input data, and

repeating, if necessary, the steps of deriving and screening until the estimated
binding of one or more candidates is adequate, whereby one or more candidate
macromolecules are engineered to bind to one or more terminal portions of the selected
target macromolecule.
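The derive, screen, and repeat steps of claim 1 can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the claimed method itself: the names `derive_candidates`, `engineer_binder`, and `toy_score` are hypothetical, substitutions are drawn at random as a placeholder for the rational engineering methods, and the binding estimate is a made-up score in which higher means better.

```python
import random

MONOMERS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def derive_candidates(current, n_variants, rng):
    """Derive alternatives by replacing one monomer of the current candidate.

    Placeholder only: substitutions are drawn at random here, whereas the
    claim calls for new monomers selected by rational engineering methods.
    """
    variants = []
    for _ in range(n_variants):
        pos = rng.randrange(len(current))
        new = rng.choice([m for m in MONOMERS if m != current[pos]])
        variants.append(current[:pos] + new + current[pos + 1:])
    return variants


def engineer_binder(precursor, estimate_binding, adequate_score,
                    n_variants=20, max_rounds=100, seed=0):
    """Repeat the derive and screen steps until estimated binding is adequate."""
    rng = random.Random(seed)
    best, best_score = precursor, estimate_binding(precursor)
    for _ in range(max_rounds):
        if best_score >= adequate_score:
            break
        # Screening: keep a candidate only if its estimated binding improves.
        for candidate in derive_candidates(best, n_variants, rng):
            score = estimate_binding(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


def toy_score(candidate, ideal="KWTL"):
    """Hypothetical binding estimate: count of matches to a made-up ideal."""
    return sum(a == b for a, b in zip(candidate, ideal))


best, best_score = engineer_binder("AAAA", toy_score, adequate_score=4)
print(best, best_score)
```

In practice the estimate would come from the rational methods recited in the claim, and the loop would stop on an adequacy criterion such as a dissociation constant threshold rather than a toy match count.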

2. The method of claim 1 wherein the input data concerning the precursor macromolecule
further comprises representations of its three-dimensional (3D) structure.

3. The method of claim 1 wherein the input data concerning the selected target
macromolecule consists essentially of representations of one or more of its terminal 
monomer sequences. 

4. The method of claim 1 further comprising the step of synthesizing one or more of the
candidate binding macromolecules having adequate evaluated binding.

5. The method of claim 4 further comprising:

measuring the actual binding of the synthesized candidate macromolecules to the 
selected target macromolecule, and 

repeating, if necessary, the steps of deriving, screening, and repeating and the further 
steps of synthesizing and measuring, until the measured actual binding of the synthesized
candidates is adequate. 

6. The method of claim 1 wherein binding is adequate if the dissociation constant (Kd) of a
synthesized candidate macromolecule from the selected target macromolecule is less than a
preselected maximum value.

7. The method of claim 6 wherein the maximum value is less than approximately 1 mM, or
less than approximately 100 µM.
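The adequacy test of claims 6 and 7 amounts to comparing a dissociation constant with a preselected maximum. The sketch below is illustrative only: the function names are hypothetical, concentrations are assumed to be in mol/L, and the free-energy conversion uses the standard relation dG = RT ln(Kd) with a 1 M reference state, which is general physical chemistry and not something the claims recite.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def is_adequate(kd_molar, max_kd_molar=100e-6):
    """Binding is adequate if Kd is below the preselected maximum (default 100 µM)."""
    return kd_molar < max_kd_molar


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy in kcal/mol, dG = RT ln(Kd / 1 M)."""
    return R_KCAL * temp_k * math.log(kd_molar)


print(is_adequate(50e-6))                 # Kd of 50 µM against the 100 µM limit
print(is_adequate(5e-4, max_kd_molar=1e-3))  # Kd of 0.5 mM against the 1 mM limit
print(round(binding_free_energy(100e-6), 2))
```

A lower Kd means tighter binding, so a more negative free energy corresponds to a candidate that passes a stricter threshold.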


8. The method of claim 1 wherein the rational engineering or estimation methods comprise
one or more computer-assisted molecular design (CAMD) methods. 

9. The method of claim 1 wherein the steps of deriving and screening are repeated at least
twice, wherein the rational engineering or estimating methods used in the later repetitions
comprise more accurate methods, and wherein the rational engineering or estimating
methods used in the earlier repetitions comprise less accurate methods.
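Claim 9 describes using less accurate methods early and more accurate methods later. One common way to arrange this is a two-stage screen in which a cheap estimate shortlists candidates and a costlier estimate ranks the shortlist. The sketch below assumes that arrangement; `staged_screen`, `coarse_score`, and `fine_score` are hypothetical names, and both scoring functions are toy stand-ins in which higher means better.

```python
def staged_screen(candidates, coarse_score, fine_score, keep=10):
    """Shortlist with a cheap, less accurate score, then rank with a better one."""
    shortlisted = sorted(candidates, key=coarse_score, reverse=True)[:keep]
    return sorted(shortlisted, key=fine_score, reverse=True)


def coarse_score(candidate):
    """Toy coarse estimate: rewards a lysine anywhere in the sequence."""
    return candidate.count("K")


def fine_score(candidate, ideal="KWTL"):
    """Toy fine estimate: position-wise matches to a made-up ideal sequence."""
    return sum(a == b for a, b in zip(candidate, ideal))


pool = ["AAAA", "KAAA", "KWAA", "KWTL"]
ranked = staged_screen(pool, coarse_score, fine_score, keep=3)
print(ranked)
```

The design choice is the usual cost trade-off: the expensive estimate is only evaluated on the `keep` candidates that survive the cheap one.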

10. The method of claim 1 wherein the monomers are amino acids, and wherein the
macromolecules are peptides or polypeptides.

11. A method of engineering one or more binding polypeptides to adequately bind to a 
selected target polypeptide comprising: 

providing, as a first candidate binding polypeptide, a precursor polypeptide which 
binds to one or more terminal peptide sequences of an initial target polypeptide, 

deriving alternative candidate binding polypeptides by replacing one or more amino
acid residues of a current candidate with new residues, wherein the new residues are
selected by rational engineering methods so that the alternative candidates are predicted to
bind with one or more terminal peptide sequences of the selected target polypeptide, and
wherein the rational engineering methods depend on

(i) concerning the selected target polypeptide, input data comprising
representations of one or more of its terminal peptide sequences, and

(ii) concerning the precursor polypeptide, input data comprising
representations of its amino acid sequence and the amino acid sequences of one or more of
the terminal peptide sequences of the initial target polypeptide bound by the precursor,

screening the alternative candidates for new candidates with improved estimated
binding to terminal peptide sequences of the selected target, wherein the binding is
estimated by rational methods in dependence on the input data, and

repeating, if necessary, the steps of deriving and screening until the estimated 
binding of one or more candidates is adequate, whereby one or more candidate 
polypeptides are engineered to bind to one or more terminal peptide sequences of the 
selected target polypeptide.

12. The method of claim 11 wherein the precursor polypeptide comprises one or more
PDZ-type domains, or one or more TPR-type domains, or one or more
proline-specific-peptidase-type domains, or one or more class-II-MHC-protein-type domains.

13. The method of claim 11 wherein polypeptides comprise peptides having peptide
sequences with lengths of less than approximately 20, or 15, or 10, or 5 residues. 

14. The method of claim 13 wherein the one or more terminal peptide sequences of the
selected target polypeptide comprise either its N-terminal or its C-terminal peptide
sequence, or both.
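The terminal peptide sequences of claims 13 and 14 are short stretches at either end of the target sequence. A minimal sketch of extracting them follows; the function name `terminal_peptides`, the default length of 10, and the example sequence are all assumptions made for illustration.

```python
def terminal_peptides(sequence, length=10):
    """Return the N-terminal and C-terminal peptides of at most `length` residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    n = min(length, len(sequence))
    return {"N": sequence[:n], "C": sequence[len(sequence) - n:]}


# Made-up ten-residue sequence in one-letter amino acid codes.
target = "MKTAYIAKQR"
print(terminal_peptides(target, length=3))
print(terminal_peptides(target, length=20))
```

When the requested length exceeds the sequence length, both termini are simply the whole sequence, which matches a peptide target that is itself shorter than the cutoff.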

15. The method of claim 11 wherein the input data concerning the precursor polypeptide
further comprises representations of its three-dimensional (3D) structure.

16. The method of claim 11 wherein the input data concerning the selected target
polypeptide consists essentially of representations of one or more of its terminal peptide
sequences.

17. The method of claim 11 further comprising the step of synthesizing one or more of the
candidate binding polypeptides having adequate evaluated binding. 

18. The method of claim 17 wherein the step of synthesizing further comprises expression 
by recombinant genetic engineering methods or synthesis by chemical means. 

19. The method of claim 17 wherein the step of synthesizing further comprises synthesizing
one or more candidate binding polypeptides fused to one or more other polypeptides. 

20. The method of claim 19 wherein the one or more other polypeptides occur in a cell of
an organism.

21. The method of claim 19 wherein the one or more other polypeptides comprise green
fluorescent protein. 

22. The method of claim 17 further comprising:

measuring the actual binding of the synthesized candidate polypeptides to the 
selected target polypeptide, and 

repeating, if necessary, the steps of deriving, screening, and repeating and the further 
steps of synthesizing and measuring, until the measured actual binding of the synthesized
candidates is adequate.

23. The method of claim 22 wherein the step of measuring further comprises performing
affinity chromatography, or biosensor analysis, or micro-calorimetry.

24. The method of claim 22 wherein the step of measuring further comprises performing a
yeast two-hybrid assay, or a phage display assay, or RNA-protein fusions. 

25. The method of claim 22 further comprising the step of measuring binding specificity by
measuring the actual binding of the synthesized candidate polypeptides to a plurality of
polypeptides different from the selected target polypeptide.

26. The method of claim 11 wherein binding is adequate if the dissociation constant (Kd) of
a synthesized candidate polypeptide from the selected target polypeptide is less than a
preselected maximum value.

27. The method of claim 26 wherein the maximum value is less than approximately 1 mM,
or less than approximately 100 µM.

28. The method of claim 11 wherein the rational engineering or estimating methods for
polypeptides further comprise methods based on a priori chemical or physical principles,
or on rules derived from empirical knowledge, or on knowledge in the art.

29. The method of claim 28 further comprising storing information relating to previously
engineered binding polypeptides to supplement the empirical knowledge or the knowledge
in the art.

30. The method of claim 28 wherein the rational engineering or estimating methods based
on principles further comprise one or more computer-assisted molecular design (CAMD)
methods for polypeptides.

31. The method of claim 30 wherein the CAMD methods for polypeptides comprise
methods which approximate side-chain conformations by rotamers from a rotamer-library,
and which approximate polypeptide backbone conformations by an inverse-folding
approach in dependence on a known 3D structure.

32. The method of claim 31 wherein the CAMD methods comprise a Perla method.

33. The method of claim 28 wherein the rational engineering or estimating methods based
on rules further comprise rules derived from examples of sequence homology with known
peptide-sequence-binding polypeptides, or derived from examples of polypeptides that bind
to peptide sequences, or derived from examples of chimeric polypeptides formed from
known peptide-sequence-binding polypeptides.

34. The method of claim 33 wherein the rules express peptide-sequence-binding
specificities of peptide-sequence-binding polypeptides, or wherein the rules express how
peptide-sequence-binding specificities of polypeptides may be modified.

35. The method of claim 28 wherein the rational engineering or estimating methods based 
on common knowledge further comprise rules for classifying amino acids into types with 
similar physical and chemical properties. 
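Rules of the kind recited in claim 35 can be represented as a lookup from residue to property class. The grouping below is one conventional textbook scheme chosen for illustration, not a classification taken from this document; the names `AA_CLASSES`, `aa_class`, and `conservative_substitutions` are hypothetical.

```python
# One conventional grouping of the 20 standard amino acids (an assumption;
# other schemes place cysteine, histidine, or tyrosine differently).
AA_CLASSES = {
    "hydrophobic": set("AVLIMC"),
    "aromatic": set("FWY"),
    "polar": set("STNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "special": set("GP"),
}


def aa_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in AA_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def conservative_substitutions(residue):
    """Residues in the same class, i.e. replacements likely to preserve properties."""
    return AA_CLASSES[aa_class(residue)] - {residue}


print(aa_class("K"))
print(sorted(conservative_substitutions("D")))
```

Such a table could be used to restrict the replacement step to same-class residues when a position should keep its character, or to other classes when a change in specificity is sought.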

36. The method of claim 11 wherein the steps of deriving and screening are repeated at
least twice, wherein the rational engineering or estimating methods used in the later 
repetitions comprise more accurate methods, and wherein the rational engineering or 
estimating methods used in the earlier repetitions comprise less accurate methods. 

37. The method of claim 11 wherein the precursor polypeptide binds to two or more
terminal peptide sequences of the initial target polypeptide, and wherein the step of
repeating, if necessary, repeats the steps of deriving and screening until the estimated
binding of one or more candidates to two or more terminal peptide sequences of the selected
target polypeptide is adequate.

38. The method of claim 37 wherein the two or more terminal peptide sequences of the
selected target polypeptide comprise both the N-terminal and the C-terminal peptide 
sequences, whereby one or more candidate polypeptides are engineered to bind bivalently 
to the selected target polypeptide. 

39. The method of claim 11 further comprising a step of performing methods of nuclear
magnetic resonance spectroscopy or x-ray crystallography to obtain structural data 
concerning one or more candidates, and wherein the rational engineering and estimating 
methods further depend on input of this structural data. 

40. A method of engineering one or more binding polypeptides to adequately bind to a
selected target polypeptide, the method comprising: 

providing, as a first candidate binding polypeptide, a precursor polypeptide which
binds to one or more N-terminal peptide sequences of an initial target polypeptide, wherein
a peptide sequence has a length of less than approximately 20, or 15, or 10, or 5 residues,

deriving alternative candidate binding polypeptides by replacing one or more amino
acid residues of a current candidate with new residues,

screening the alternative candidates for new candidates with improved binding to
N-terminal peptide sequences of the selected target polypeptide, and

repeating, if necessary, the steps of deriving and screening until the binding of one
or more candidates is adequate.

41. The method of claim 40 wherein the precursor polypeptide comprises one or more
PDZ-type domains, or one or more TPR-type domains, or one or more
proline-specific-peptidase-type domains, or one or more class-II-MHC-protein-type domains.

42. The method of claim 40 wherein the steps of deriving and screening further comprise
rational engineering or estimation methods for polypeptides based on a priori chemical
or physical principles, or on rules derived from empirical knowledge, or on knowledge in
the art.

43. The method of claim 42 wherein the precursor polypeptide has a known
three-dimensional (3D) structure.

44. The method of claim 40 wherein the steps of deriving and screening further comprise
synthesizing one or more of the alternative candidate binding polypeptides.

45. A computer system for engineering one or more binding polypeptides from a selected
precursor polypeptide, wherein the precursor polypeptide binds to one or more terminal
peptide sequences of an initial target polypeptide, and wherein the binding polypeptides
adequately bind to a selected target polypeptide, the system comprising:

a processor, and 

a memory accessible to the processor, wherein the memory is configured with 

(a) data for representing the precursor polypeptide, the initial target
polypeptide, the selected target polypeptide, and further candidate polypeptides, and
wherein

(i) the data for representing the selected target polypeptide comprises
data representing one or more of its terminal peptide sequences, and

(ii) the data representing the precursor polypeptide comprises data
representing its amino acid sequence and the amino acid sequences of the one or more
terminal peptide sequences of the initial target polypeptide bound by the precursor, and

(b) instructions for causing the processor, in dependence on the represented
data, to perform the steps of

(i) rational engineering methods for deriving alternative candidate
binding polypeptides by replacing one or more amino acid residues of a current candidate
with new residues so that the alternative candidates are predicted to bind with one or more
terminal peptide sequences of the selected target polypeptide,

(ii) rational binding-estimating methods for screening the alternative
candidates for new candidates with improved estimated binding to terminal peptide
sequences of the selected target, and

(iii) repeating, if necessary, the steps of rational engineering and
estimating until the estimated binding of one or more candidates is adequate, whereby one
or more candidate polypeptides are engineered to bind to one or more terminal peptide
sequences of the selected target polypeptide.

46. The system of claim 45 wherein data represents polypeptides as comprising peptides
having peptide sequences with lengths of less than approximately 20, or 15, or 10, or 5 
residues. 

47. The system of claim 45 wherein the data represents one or more terminal peptide
sequences of the selected target polypeptide as comprising either its N-terminal or its
C-terminal peptide sequence, or both.

48. The system of claim 45 wherein the data representing the precursor polypeptide further
comprises data representing its three-dimensional (3D) structure, and wherein the data
representing the selected target polypeptide consists essentially of data representing one
or more of its terminal peptide sequences.

49. The system of claim 45 wherein the selected precursor polypeptide binds to two or
more terminal peptide sequences of the initial target polypeptide, and wherein the
instructions for repeating, if necessary, cause the processor to repeat the steps of rational
engineering and rational estimating until the estimated binding of one or more candidates to
two or more terminal peptide sequences of the selected target polypeptide is adequate.

50. The method of claim 49 wherein the two or more terminal peptide sequences of the
selected target polypeptide comprise both the N-terminal and the C-terminal peptide 
sequences, whereby one or more candidate polypeptides are engineered to bind bivalently 
to the selected target polypeptide. 

51. The system of claim 45 wherein the instructions for causing the processor to perform
the steps of rational engineering or of rational binding-estimating further comprise
instructions for performing methods based on a priori chemical or physical principles, or
on rules derived from empirical knowledge, or on knowledge in the art.

52. The method of claim 51 wherein the methods based on principles further comprise one
or more computer-assisted molecular design (CAMD) methods for polypeptides.

53. The method of claim 52 wherein the CAMD methods for polypeptides comprise
methods which approximate side-chain conformations by rotamers from a rotamer-library,
and which approximate polypeptide backbone conformations by an inverse-folding
approach in dependence on a known 3D structure.

54. The method of claim 53 wherein the CAMD methods comprise a Perla method.

55. The system of claim 52 wherein the instructions for causing the processor to perform a
CAMD method further comprise instructions for performing two or more CAMD methods
of increasing accuracy.

56. A computer-readable medium with encoded instructions stored therein for causing a
computer to perform the method of claim 45.

57. A polypeptide for binding to a selected target polypeptide engineered according to the
method of claim 11.

58. A vector for causing expression in a host cell of a polypeptide engineered for binding to
a selected target polypeptide according to the method of claim 11.

59. A cell comprising a nucleic acid sequence encoding a polypeptide engineered for 
binding to a selected target polypeptide according to the method of claim 11. 

60. The cell of claim 59 further comprising the engineered polypeptide. 

61. The cell of claim 59 further comprising the engineered polypeptide fused to a partner
polypeptide comprising a peptide or a polypeptide.

62. The cell of claim 61 wherein the partner polypeptide comprises a polypeptide sequence
causing the localization of the fusion to a selected intracellular compartment.

63. The cell of claim 61 wherein the partner polypeptide comprises a polypeptide sequence 
causing degradation of the fusion. 

64. The cell of claim 61 wherein the partner polypeptide comprises a label.

65. The cell of claim 64 wherein the label is green fluorescent protein. 

66. A method for altering the function of a first cellular protein, which does not naturally
bind to a second cellular protein, comprising:

providing a binding protein engineered according to the method of claim 11 for
binding to the second cellular protein, and

expressing the binding protein fused to the first cellular protein so that the first cellular
protein as part of the fusion binds non-naturally to the second cellular protein.

67. A cell comprising a nucleic acid encoding a cellular protein altered according to claim 
66. 



68. A method for altering the function of a selected cellular protein, which naturally binds
to an initial polypeptide, comprising:

engineering the selected cellular protein to bind to a new target polypeptide
according to the method of claim 11, and

expressing the engineered selected cellular protein.

69. The method of claim 68 wherein the initial polypeptide is a part of a first cellular
protein to which the selected cellular protein naturally binds, and wherein the new target 
polypeptide is part of a second cellular protein to which the selected cellular protein does 
not naturally bind. 

70. The method of claim 68 wherein the selected cellular protein is an enzyme, and
wherein the engineered selected protein has altered substrate specificity or enzymatic
activity.

71. A cell comprising a cellular protein with function altered according to claim 69.

72. A method for assaying for one or more target polypeptides in a sample comprising:

contacting, in binding conditions, the sample with binding polypeptides, wherein
one or more binding polypeptides are engineered by the method of claim 11 to bind to one
or more of the target polypeptides, and

assaying for binding polypeptides bound to their respective target polypeptides,
whereby the target polypeptides are assayed.

73. The method of claim 72 further comprising, before the step of contacting, a step of
attaching one or more binding polypeptides to a substrate.

74. The method of claim 72 wherein the step of assaying further comprises performing
affinity chromatography, or biosensor analysis, or nuclear magnetic resonance spectroscopy, 
or micro-calorimetry. 

75. The method of claim 72 wherein the step of assaying further comprises performing a
yeast two-hybrid assay, or a phage display assay, or RNA-protein fusion.

76. The method of claim 72 wherein the one or more target polypeptides is a single target
polypeptide.

77. A method of determining the cellular localization of a target protein comprising:

providing a binding protein engineered by the method of claim 11 to bind to the
target protein,

contacting the cell with the binding protein under binding conditions, and

assaying for the presence and location in the cell of the binding protein bound to the
target protein.

78. The method of claim 77 wherein the binding protein is fused to a polypeptide label.

79. The method of claim 77 wherein the step of assaying further comprises performing an
immuno-chemical method using antibodies to the binding protein.

80. The method of claim 77 wherein the step of contacting comprises expressing the
binding protein in the cell.

81. A method for assaying for target proteins in a sample from an organism comprising:

contacting, in binding conditions, the sample with binding polypeptides, wherein the
binding polypeptides bind to one or more terminal peptide sequences of a plurality of
selected proteins expressed in the organism, and wherein the plurality of selected expressed
proteins comprises more than 50 different proteins, and

assaying for binding polypeptides bound to their respective target proteins, whereby
the target proteins are assayed.

82. The method of claim 81 further comprising a step of engineering the binding proteins to
bind to the terminal peptide sequences of the selected plurality of proteins by the method of
claim 11.

83. The method of claim 81 further comprising the step of expressing the binding proteins
in members of one or more libraries of recombinant entities.

84. The method of claim 81 wherein the terminal peptide sequences are N-terminal
sequences, or C-terminal sequences, or both, having lengths less than approximately 15
amino acids.

85. The method of claim 81 wherein the plurality of selected proteins comprises more than
500 or more than 5,000 different proteins.

86. The method of claim 81 wherein the plurality of selected proteins comprises less than
5,000 or less than 50,000 different proteins.

87. The method of claim 81 wherein the plurality of selected proteins comprises more than
0.5% of the proteins expressed in a cell of the organism.

88. The method of claim 81 wherein the plurality of selected proteins comprises less than 
50% or less than 80% of the proteins expressed in a cell of the organism. 

89. The method of claim 81 wherein the binding polypeptides are attached to one or more
substrates. 

90. A library comprising recombinant organisms expressing a plurality of binding
polypeptides,

wherein each binding polypeptide binds to one or more terminal peptide sequences
of each of a plurality of selected proteins expressed in an organism, and

wherein the plurality of selected expressed proteins comprises more than 50
different proteins.

91. The library of claim 90 wherein the recombinant organisms comprise phage particles.

92. The library of claim 90 wherein the recombinant organisms comprise cells for use
in a two-hybrid assay.

93. The library of claim 90 wherein the terminal peptide sequences are N-terminal
sequences, or C-terminal sequences, or both, having lengths less than approximately 15
amino acids.

94. The library of claim 90 wherein the plurality of selected proteins comprises more than
500 different proteins.

95. The library of claim 90 wherein the plurality of selected proteins comprises more than
5,000 different proteins.

96. The library of claim 90 wherein the plurality of selected proteins comprises less than
5,000 different proteins.

97. The library of claim 90 wherein the plurality of selected proteins comprises less than
50,000 different proteins.

98. The library of claim 90 wherein the plurality of selected proteins comprises more than
0.5% of the proteins expressed in a cell of the organism.

99. The library of claim 90 wherein the plurality of selected proteins comprises less than
50% of the proteins expressed in a cell of the organism.

100. The library of claim 90 wherein the plurality of selected proteins comprises less than
80% of the proteins expressed in a cell of the organism.

101. The library of claim 90 wherein the binding proteins are engineered to bind to the
terminal peptide sequences of the selected plurality of proteins by the method of claim 11.

102. A polypeptide array comprising:

a substrate with at least one surface, and

a plurality of binding polypeptides regularly arranged on the surface,

wherein each binding polypeptide binds to one or more terminal peptide
sequences of each of a plurality of selected proteins expressed in an organism, and

wherein the plurality of selected expressed proteins comprises more than 50
different proteins.

103. The array of claim 102 wherein the binding polypeptides are covalently attached to the
surface.

104. The array of claim 102 wherein the substrate comprises glass or plastic.

105. The array of claim 102 wherein the terminal peptide sequences are N-terminal
sequences, or C-terminal sequences, or both, having lengths less than approximately 15
amino acids.

106. The array of claim 102 wherein the plurality of selected proteins comprises more than
500 different proteins. 

107. The array of claim 102 wherein the plurality of selected proteins comprises more than 
5,000 different proteins. 

108. The array of claim 102 wherein the plurality of selected proteins comprises less than 
5,000 different proteins. 

109. The array of claim 102 wherein the plurality of selected proteins comprises less than
50,000 different proteins.

110. The array of claim 102 wherein the plurality of selected proteins comprises more than
0.5% of the proteins expressed in a cell of the organism.

111. The array of claim 102 wherein the plurality of selected proteins comprises less than
50% of the proteins expressed in a cell of the organism.

112. The array of claim 102 wherein the plurality of selected proteins comprises less than
80% of the proteins expressed in a cell of the organism.

113. The array of claim 102 wherein the binding polypeptides are engineered to bind to the
terminal peptide sequences of the selected plurality of proteins by the method of claim 11.

114. A polypeptide-RNA-fusion array comprising:

a substrate with at least one surface, and

a plurality of binding-polypeptide-RNA fusions regularly arranged on the surface,

wherein each binding polypeptide of the fusions binds to one or more
terminal peptide sequences of each of a plurality of selected proteins, and 

wherein the RNAs of the fusions comprise sequences that encode for the 
corresponding fused binding polypeptides. 

115. The array of claim 114 wherein the terminal peptide sequences are N-terminal
sequences, or C-terminal sequences, or both, having lengths less than approximately 15
amino acids.

116. The array of claim 114 wherein the plurality of selected proteins comprises more than
500 different proteins.

117. The array of claim 114 wherein the plurality of selected proteins comprises less than
5,000 different proteins.

118. A method of purifying one or more selected proteins from a sample comprising: 

providing one or more binding polypeptides that bind to one or more of the terminal
peptide sequences of one or more selected proteins, wherein the binding polypeptides are
engineered by the method of claim 11,

contacting the sample in binding conditions with the binding polypeptides so that
selected proteins in the contacted sample are bound to the binding proteins,

washing the contacted sample in washing conditions so that unbound proteins are
removed while bound selected proteins remain, and

eluting the washed sample in eluting conditions so that bound selected proteins are
removed from the binding polypeptides, whereby the eluted selected proteins are purified
from the sample.

119. The method of claim 118 wherein the step of providing further provides the binding
polypeptides attached to a substrate.

120. The method of claim 118 wherein the steps of contacting, washing, and eluting are
performed by the methods of affinity chromatography.

121. The method of claim 118 wherein at least two of the selected proteins are bound in a
protein complex.
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122. A method of engineering one or more binding macromolecules to adequately bind to a 
selected target macromolecule, wherein macromolecules comprise linear polymers of 
monomers, said method comprising: 

providing, as a first candidate binding macromolecule, a precursor macromolecule 
5 which binds to one or more tenninal portions of an initial target macromolecule, 

deriving alternative candidate binding macromolecules by replacing one or more 
monomers of a current candidate with new monomers, wherein the new monomers are 
selected by rational engineering methods so that the altemative candidates are predicted to 
bind with one or more terminal portions of the selected target macromolecule, and wherein 
10 the rational engineering methods depend on 

(i) concerning the selected target macromolecule, input data consisting
essentially of representations of one or more of its terminal monomer sequences, and

(ii) concerning the precursor macromolecule, input data comprising
representations of its monomer sequence, its three-dimensional (3D) structure, and the
monomer sequences of one or more of the terminal portions of the initial target
macromolecule bound by the precursor,

screening the alternative candidates for new candidates with improved estimated
binding to terminal portions of the selected target, wherein the binding is estimated by
rational methods comprising one or more computer-assisted molecular design (CAMD)
methods in dependence on the input data, and

repeating, if necessary, the steps of deriving and screening until the estimated
binding of one or more candidates is adequate, wherein binding is adequate if the
dissociation constant (Kd) of a synthesized candidate polypeptide from the selected target
polypeptide is less than approximately 100 μM, whereby one or more candidate
macromolecules are engineered to bind to one or more terminal portions of the selected
target macromolecule.
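
The deriving, screening, and repeating steps recited above can be sketched as a simple loop. This is a minimal sketch under stated assumptions and is not the claimed CAMD procedure: the names `derive_candidates`, `toy_estimated_kd_um`, `engineer`, and the `ideal_binder` argument are hypothetical, and the scoring function is a placeholder that stands in for whatever rational binding estimate is actually used.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues
KD_ADEQUATE_UM = 100.0  # adequacy threshold recited in the claim, in micromolar


def derive_candidates(current, n_variants, rng):
    """Derive alternatives by replacing one residue of the current candidate."""
    variants = []
    for _ in range(n_variants):
        pos = rng.randrange(len(current))
        choices = [a for a in AMINO_ACIDS if a != current[pos]]
        variants.append(current[:pos] + rng.choice(choices) + current[pos + 1:])
    return variants


def toy_estimated_kd_um(candidate, ideal_binder):
    """HYPOTHETICAL placeholder for a CAMD estimate: each residue matching an
    assumed ideal binder sequence halves the estimated Kd (in micromolar)."""
    matches = sum(a == b for a, b in zip(candidate, ideal_binder))
    return 10_000.0 * 0.5 ** matches


def engineer(precursor, ideal_binder, max_rounds=200, n_variants=20, seed=0):
    """Repeat derive-and-screen until the estimated Kd is adequate or rounds run out."""
    rng = random.Random(seed)
    best, best_kd = precursor, toy_estimated_kd_um(precursor, ideal_binder)
    for _ in range(max_rounds):
        if best_kd < KD_ADEQUATE_UM:
            break  # estimated binding is adequate; no further repetition needed
        variants = derive_candidates(best, n_variants, rng)
        top = min(variants, key=lambda v: toy_estimated_kd_um(v, ideal_binder))
        top_kd = toy_estimated_kd_um(top, ideal_binder)
        if top_kd < best_kd:  # keep only candidates with improved estimated binding
            best, best_kd = top, top_kd
    return best, best_kd, best_kd < KD_ADEQUATE_UM
```

A call such as `engineer("AAAAAAAA", "WYFLKRDE")` returns the best candidate found, its estimated Kd, and a flag indicating whether the adequacy threshold was reached. In practice the placeholder score would be replaced by an estimate that depends on the target's terminal sequences and the precursor's sequence and 3D structure.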

123. A method of engineering one or more binding polypeptides to adequately bind to a 
selected target polypeptide comprising: 

providing, as a first candidate binding polypeptide, a precursor polypeptide which
binds to one or more terminal peptide sequences of an initial target polypeptide, wherein the
one or more terminal peptide sequences of an initial target polypeptide comprise either an
N-terminal, or a C-terminal peptide sequence, or both, with lengths of less than
approximately 15 residues,

deriving alternative candidate binding polypeptides by replacing one or more amino
acid residues of a current candidate with new residues, wherein the new residues are
selected by rational engineering methods so that the alternative candidates are predicted to
bind with one or more terminal peptide sequences of the selected target polypeptide,
wherein the one or more terminal peptide sequences of the selected target polypeptide
comprise either an N-terminal, or a C-terminal peptide sequence, or both, with lengths of
less than approximately 15 residues, and wherein the rational engineering methods depend
on

(i) concerning the selected target polypeptide, input data consisting
essentially of representations of one or more of its terminal peptide sequences, and

(ii) concerning the precursor polypeptide, input data comprising
representations of its amino acid sequence, of its three-dimensional (3D) structure, and of
the amino acid sequences of one or more of the terminal peptide sequences of the initial
target polypeptide bound by the precursor,

screening the alternative candidates for new candidates with improved estimated
binding to terminal peptide sequences of the selected target, wherein the binding is
estimated by rational methods comprising one or more computer-assisted molecular design
(CAMD) methods for polypeptides in dependence on the input data, and

repeating, if necessary, the steps of deriving and screening until the estimated
binding of one or more candidates is adequate, wherein binding is adequate if the
dissociation constant (Kd) of a synthesized candidate polypeptide from the selected target
polypeptide is less than approximately 100 μM, whereby one or more candidate
polypeptides are engineered to bind to one or more terminal peptide sequences of the
selected target polypeptide.
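
For orientation, the adequacy threshold on the dissociation constant can be related to a standard binding free energy through the textbook relation ΔG° = RT ln(Kd / 1 M). The sketch below assumes a 1 M standard state and a temperature of 298.15 K; the function names `kd_to_delta_g` and `is_adequate` are hypothetical and do not appear in the claims.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol for a Kd given in mol/L
    (assumes a 1 M standard state)."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def is_adequate(kd_molar, threshold_molar=100e-6):
    """Adequacy test: Kd below the threshold (100 micromolar by default)."""
    return kd_molar < threshold_molar


# A Kd of 100 micromolar corresponds to roughly -5.5 kcal/mol at 298.15 K.
print(round(kd_to_delta_g(100e-6), 2))
```

Because the claim recites "approximately" 100 μM, the strict comparison in `is_adequate` is a simplification.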

124. The method of claim 123 wherein the CAMD methods for polypeptides comprise 
methods which approximate side-chain conformations by rotamers from a rotamer-library,
and which approximate polypeptide backbone conformations by an inverse-folding
approach in dependence on a known 3D structure. 

125. The method of claim 123 wherein the rational engineering or estimating methods 
further comprise rules derived from examples of sequence homology with known
peptide-sequence-binding polypeptides, or derived from examples of polypeptides that bind to
peptide sequences, or derived from examples of chimeric polypeptides formed from known 
peptide-sequence-binding polypeptides. 

126. The method of claim 123 wherein the terminal peptide sequences have lengths of less
than approximately 10 residues. 
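
The terminal peptide sequences used as input data can be illustrated by slicing a protein sequence at each end. This is a minimal sketch assuming one-letter amino acid strings; the function name `terminal_peptides` and the example sequence are hypothetical, and "less than approximately N residues" is simplified to "at most N - 1 residues".

```python
def terminal_peptides(sequence, max_len=15):
    """Return (N-terminal, C-terminal) peptides with fewer than max_len residues."""
    if max_len < 2:
        raise ValueError("max_len must allow at least one residue")
    n = min(len(sequence), max_len - 1)
    n_term = sequence[:n]
    c_term = sequence[len(sequence) - n:]
    return n_term, c_term


# Hypothetical 22-residue sequence, for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
n15, c15 = terminal_peptides(protein)              # fewer than 15 residues each
n10, c10 = terminal_peptides(protein, max_len=10)  # fewer than 10 residues each
```

With `max_len=10` the same helper gives the shorter terminal sequences recited in the dependent claim.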

127. A method of engineering one or more binding polypeptides to adequately bind to a
selected target polypeptide comprising: 

providing, as a first candidate binding polypeptide, a precursor polypeptide which
binds to two or more terminal peptide sequences of an initial target polypeptide, wherein the
two or more terminal peptide sequences of an initial target polypeptide comprise either
N-terminal or C-terminal peptide sequences, or both, with lengths of less than approximately
15 residues,

deriving alternative candidate binding polypeptides by replacing one or more amino
acid residues of a current candidate with new residues, wherein the new residues are
selected by rational engineering methods so that the alternative candidates are predicted to
bind with two or more terminal peptide sequences of the selected target polypeptide,
wherein the two or more terminal peptide sequences of the selected target polypeptide
comprise either N-terminal or C-terminal peptide sequences, or both, with lengths of less
than approximately 15 residues, and wherein the rational engineering methods depend on

(i) concerning the selected target polypeptide, input data consisting
essentially of representations of two or more of its terminal peptide sequences, and

(ii) concerning the precursor polypeptide, input data comprising
representations of its amino acid sequence, of its three-dimensional (3D) structure, and of
amino acid sequences of two or more of the terminal peptide sequences of the initial target
polypeptide bound by the precursor,

screening the alternative candidates for new candidates with improved estimated
binding to terminal peptide sequences of the selected target, wherein the binding is
estimated by rational methods comprising one or more computer-assisted molecular design
(CAMD) methods for polypeptides in dependence on the input data, and

repeating, if necessary, the steps of deriving and screening until the estimated
binding of two or more candidates is adequate, wherein binding is adequate if the
dissociation constant (Kd) of a synthesized candidate polypeptide from the selected target
polypeptide is less than approximately 100 μM, whereby one or more candidate
polypeptides are engineered to bind to one or more terminal peptide sequences of the
selected target polypeptide.

128. A computer system for engineering one or more binding polypeptides from a selected 
precursor polypeptide, wherein the precursor polypeptide binds to one or more terminal
peptide sequences of an initial target polypeptide, and wherein the binding polypeptides
adequately bind to a selected target polypeptide, the system comprising: 

a processor, and

a memory accessible to the processor, wherein the memory is configured with

(a) data for representing the precursor polypeptide, the initial target
polypeptide, the selected target polypeptide, and further candidate polypeptides, and
wherein

(i) the data for representing the selected target polypeptide consists
essentially of data representing one or more of its terminal peptide sequences, wherein the
one or more terminal peptide sequences of the selected target polypeptide comprise either
an N-terminal, or a C-terminal peptide sequence, or both, with lengths of less than
approximately 15 residues, and

(ii) the data representing the precursor polypeptide comprises data
representing its amino acid sequence, its three-dimensional (3D) structure, and one or more
of the amino acid sequences of the terminal peptide sequences of the initial target
polypeptide bound by the precursor, wherein the one or more terminal peptide sequences of
the initial target polypeptide comprise either an N-terminal or a C-terminal peptide
sequence, or both, with lengths of less than approximately 15 residues, and

(b) instructions for causing the processor, in dependence on the represented
data, to perform the steps of

(i) rational engineering methods for deriving alternative candidate
binding polypeptides by replacing one or more amino acid residues of a current candidate
with new residues so that the alternative candidates are predicted to bind with one or more
terminal peptide sequences of the selected target polypeptide,

(ii) rational binding-estimating methods comprising one or more
computer-assisted molecular design (CAMD) methods for polypeptides for screening the
alternative candidates for new candidates with improved estimated binding to terminal
peptide sequences of the selected target, and

(iii) repeating, if necessary, the steps of rational engineering and
estimating until the estimated binding of one or more candidates is adequate, wherein
binding is adequate if the dissociation constant (Kd) of a synthesized candidate polypeptide
from the selected target polypeptide is less than approximately 100 μM, whereby one or
more candidate polypeptides are engineered to bind to one or more terminal peptide
sequences of the selected target polypeptide.

129. The method of claim 128 wherein the terminal peptide sequences have lengths of less 
than approximately 10 residues.

130. A polypeptide for binding to a selected target polypeptide engineered according to the
method of claim 123. 

131. A method for assaying for one or more target polypeptides in a sample comprising: 
contacting, in binding conditions, the sample with binding polypeptides, wherein
one or more binding polypeptides are engineered by the method of claim 123 to bind to one
or more of the target polypeptides, and 

assaying for binding polypeptides bound to their respective target polypeptides, 
whereby the target polypeptides are assayed. 

132. A polypeptide array comprising: 

a substrate with at least one surface, and 

a plurality of binding polypeptides regularly arranged on the surface, 

wherein each binding polypeptide binds to one or more N-terminal or C-terminal
peptide sequences, or both, having lengths less than approximately 15 amino acids
of each of a plurality of selected proteins expressed in an organism, and 

wherein the plurality of selected expressed proteins comprises more than 500 
different proteins and less than 50,000 different proteins. 

133. A polypeptide array comprising:

a substrate with at least one surface, and 

a plurality of binding polypeptides regularly arranged on the surface, 

wherein each binding polypeptide binds to one or more N-terminal or C-terminal
peptide sequences, or both, having lengths less than approximately 15 amino acids
of each of a plurality of selected proteins expressed in an organism, and

wherein the plurality of selected expressed proteins comprises more than 
5000 different proteins. 


